Module 4: Factorial Treatment Structure

Analyzing a Factorial: Model, ANOVA, and Decision Flow

Example 4.2: Bakery

Treatment Structure: 3 x 2 Full Factorial

  • Shelf Height (Bottom, Middle, Top)
  • Shelf Width (Regular, Wide)

Design Structure: CRD with r = 2 stores per treatment combination.

Response: Bread Sales

What changes when we add a factor?

  • We add:
    • Another main effect
    • An interaction term
  • We must decide what effects are present before interpreting the treatment means.

The goal of the analysis is to decide which effects matter.

The Treatment Effects Model (Two-way Factorial)

\[y_{ijk}=\mu+\alpha_i+\beta_j+\alpha\beta_{ij}+\epsilon_{ijk} \text{ with } \epsilon_{ijk} \sim \text{ iid }N(0,\sigma^2)\]

\[\text{for } i=1,2,\ldots,a;\ j=1,2,\ldots,b;\ k=1,2,\ldots,r\]

  • \(y_{ijk}\): the response (sales) from the \(k^{th}\) experimental unit (store) assigned the \(i^{th}\) level of Factor A (height) and the \(j^{th}\) level of Factor B (width)
  • \(\mu\): the grand (overall) mean sales
  • \(\alpha_i\): the effect of the \(i^{th}\) level of Factor A (height)
  • \(\beta_j\): the effect of the \(j^{th}\) level of Factor B (width)
  • \(\alpha\beta_{ij}\): the interaction effect between the \(i^{th}\) level of Factor A (height) and the \(j^{th}\) level of Factor B (width)
  • \(\epsilon_{ijk}\): the experimental error for the \(k^{th}\) experimental unit (store) assigned the \(i^{th}\) level of Factor A (height) and the \(j^{th}\) level of Factor B (width)
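
To see how the pieces of the model combine into an observation, here is a minimal simulation sketch. All parameter values (`mu`, `alpha`, `beta`, `ab`, `sigma`) and the seed are hypothetical and chosen only for illustration; they are not the bakery estimates.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical parameter values (assumptions, not estimates from the bakery data)
mu    <- 50
alpha <- c(-5, 10, -5)     # Factor A effects, sum to zero (a = 3)
beta  <- c(-2, 2)          # Factor B effects, sum to zero (b = 2)
ab    <- matrix(0, 3, 2)   # interaction effects; all zero means "no interaction"
sigma <- 3                 # error standard deviation
r     <- 2                 # experimental units per treatment combination

# One row per experimental unit: every (i, j) combination, r times
sim <- expand.grid(k = seq_len(r), j = 1:2, i = 1:3)

# y_ijk = mu + alpha_i + beta_j + (alpha beta)_ij + epsilon_ijk
sim$y <- mu + alpha[sim$i] + beta[sim$j] + ab[cbind(sim$i, sim$j)] +
  rnorm(nrow(sim), mean = 0, sd = sigma)
sim
```

Changing entries of `ab` (while keeping each row and column summing to zero) is one way to see what an interaction does to the cell means.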

ANOVA for Two-way Factorials

Just like one-way ANOVA: \(SST = SSTrt + SSE\)

For a factorial: \(SSTrt = SSA + SSB + SSAB\)

  • Main effect of factor A (Height): \(SSA = rb\sum_i(\bar y_{i\cdot\cdot}-\bar y_{\cdot\cdot\cdot})^2\)
  • Main effect of factor B (Width): \(SSB = ra\sum_j(\bar y_{\cdot j\cdot}-\bar y_{\cdot\cdot\cdot})^2\)
  • AB Interaction Effect (Height x Width): \(SSAB = SST - SSE - SSA - SSB\)
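
The subtraction formula for \(SSAB\) is a shortcut. For reference, the remaining sums of squares can also be written directly in terms of means. This is the standard form for a balanced design, and it assumes \(\bar y_{ij\cdot}\) denotes the mean of the \(r\) stores in treatment combination \((i,j)\), a symbol not defined above:

\[SSAB = r\sum_i\sum_j(\bar y_{ij\cdot}-\bar y_{i\cdot\cdot}-\bar y_{\cdot j\cdot}+\bar y_{\cdot\cdot\cdot})^2\]

\[SSE = \sum_i\sum_j\sum_k(y_{ijk}-\bar y_{ij\cdot})^2 \qquad SST = \sum_i\sum_j\sum_k(y_{ijk}-\bar y_{\cdot\cdot\cdot})^2\]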

Full ANOVA Table (Two-way)

| Source (SV) | df | SS | MS | F | Question answered |
|---|---|---|---|---|---|
| A | \(a-1\) | SSA | MSA \(=\frac{SSA}{a-1}\) | \(\frac{MSA}{MSE}\) | Do avg sales differ across shelf heights? |
| B | \(b-1\) | SSB | MSB \(=\frac{SSB}{b-1}\) | \(\frac{MSB}{MSE}\) | Do avg sales differ across shelf widths? |
| AB | \((a-1)(b-1)\) | SSAB | MSAB \(=\frac{SSAB}{(a-1)(b-1)}\) | \(\frac{MSAB}{MSE}\) | Does the effect of width depend on height? |
| Error: e.u.(AB) | \((r-1)ab\) | SSE | MSE \(=\frac{SSE}{(r-1)ab}\) | | |
| Total | \(N-1\) | SST | | | |

Decision Flowchart

What are we testing?

Interaction

\[H_0:\text{ All } \alpha\beta_{ij} = 0 \text{ vs } H_A: \text{At least one } \alpha\beta_{ij} \ne 0\]

Main Effect of A

\[H_0:\text{ All } \alpha_{i} = 0 \text{ vs } H_A: \text{At least one } \alpha_{i} \ne 0\]

Main Effect of B

\[H_0:\text{ All } \beta_{j} = 0 \text{ vs } H_A: \text{At least one } \beta_{j} \ne 0\]
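
One common way to order these three tests is to look at the interaction first and only interpret main effects when the interaction is not significant. The sketch below encodes that convention. The function name `factorial_decision`, the 0.05 cutoff, and the wording of the advice are assumptions for illustration, not necessarily the exact flowchart used in this module.

```r
# Hypothetical helper: suggest which means to compare, given the ANOVA p-values.
# The alpha = 0.05 cutoff is an assumption; use whatever level the study set.
factorial_decision <- function(p_interaction, p_a, p_b, alpha = 0.05) {
  if (p_interaction < alpha) {
    # Interaction present: the effect of one factor depends on the other,
    # so compare treatment-combination (cell) means, not marginal means.
    return("interaction: compare cell means (simple effects)")
  }
  sig <- c(A = p_a < alpha, B = p_b < alpha)
  if (!any(sig)) {
    return("no effects detected")
  }
  paste0("main effects: compare marginal means of ",
         paste(names(sig)[sig], collapse = " and "))
}

# p-values taken from the anova() output reported for the bakery model
factorial_decision(p_interaction = 0.3747, p_a = 5.754e-05, p_b = 0.3226)
```

For the bakery p-values the helper points to the main effect of Factor A (height) only.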

Example 4.2: Skeleton ANOVA

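| Source of Variation | DF |
|---|---|
| Height | \(3-1=2\) |
| Width | \(2-1=1\) |
| Height x Width | \((3-1)(2-1)=2\) |
| Error: store(Height x Width) | \((2-1)(3)(2)=6\) |
| Total | 12 stores - 1 = 11 |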
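
The degrees-of-freedom bookkeeping can be checked with a few lines of R. This is a small sketch for a balanced \(a \times b\) factorial in a CRD; the function name `skeleton_df` is hypothetical.

```r
# Degrees of freedom for a balanced a x b factorial in a CRD with r units per cell
skeleton_df <- function(a, b, r) {
  c(A     = a - 1,
    B     = b - 1,
    AB    = (a - 1) * (b - 1),
    Error = (r - 1) * a * b,
    Total = a * b * r - 1)
}

# Bakery: 3 shelf heights, 2 shelf widths, 2 stores per combination
bakery_df <- skeleton_df(a = 3, b = 2, r = 2)
bakery_df

# The four component df add up to the total df
sum(bakery_df[c("A", "B", "AB", "Error")]) == bakery_df[["Total"]]
```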

R: Fitting the Model

Let’s prep the data…

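library(tidyverse)  # provides read_csv(), mutate(), and the %>% pipe

# Read the data and set the factor level order explicitly
bakery_data <- read_csv("data/04_bakery_data.csv") %>% 
  mutate(height = factor(height, levels = c("bottom", "middle", "top")),
         width = factor(width, levels = c("regular", "wide")))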
head(bakery_data)
# A tibble: 6 × 4
  height width   sales placement    
  <fct>  <fct>   <dbl> <chr>        
1 bottom regular    47 bottomregular
2 bottom regular    43 bottomregular
3 bottom wide       46 bottomwide   
4 bottom wide       40 bottomwide   
5 middle regular    62 middleregular
6 middle regular    68 middleregular
levels(bakery_data$height)
[1] "bottom" "middle" "top"   
levels(bakery_data$width)
[1] "regular" "wide"   

R: Fitting the Model

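# Sum-to-zero coding for unordered factors, so the coefficients estimate the effects in the treatment effects model
options(contrasts = c("contr.sum", "contr.poly"))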
bakery_mod <- lm(sales ~ height + width + height:width, data = bakery_data)
anova(bakery_mod)
Analysis of Variance Table

Response: sales
             Df Sum Sq Mean Sq F value    Pr(>F)    
height        2   1544  772.00 74.7097 5.754e-05 ***
width         1     12   12.00  1.1613    0.3226    
height:width  2     24   12.00  1.1613    0.3747    
Residuals     6     62   10.33                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(bakery_mod)

Call:
lm(formula = sales ~ height + width + height:width, data = bakery_data)

Residuals:
   Min     1Q Median     3Q    Max 
    -3     -2      0      2      3 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)      51.000      0.928  54.959 2.44e-09 ***
height1          -7.000      1.312  -5.334  0.00177 ** 
height2          16.000      1.312  12.192 1.85e-05 ***
width1           -1.000      0.928  -1.078  0.32261    
height1:width1    2.000      1.312   1.524  0.17835    
height2:width1   -1.000      1.312  -0.762  0.47494    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.215 on 6 degrees of freedom
Multiple R-squared:  0.9622,    Adjusted R-squared:  0.9308 
F-statistic: 30.58 on 5 and 6 DF,  p-value: 0.0003384
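
To connect the output back to the ANOVA formulas, the F statistics, p-values, \(R^2\), and residual standard error can be reproduced from the sums of squares and degrees of freedom alone. This is a sketch that copies the numbers from the `anova()` table above rather than refitting the model; the object names are arbitrary.

```r
# Sums of squares and df copied from the anova() table
ss <- c(height = 1544, width = 12, interaction = 24, error = 62)
df <- c(height = 2,    width = 1,  interaction = 2,  error = 6)

ms <- ss / df                        # mean squares: MS = SS / df
f  <- ms[1:3] / ms[["error"]]        # F = MS(effect) / MSE
p  <- pf(f, df[1:3], df[["error"]], lower.tail = FALSE)  # upper-tail F probability

round(f, 4)    # compare with the "F value" column
signif(p, 4)   # compare with the "Pr(>F)" column

# R-squared: share of the total sum of squares explained by the treatments
r_squared <- sum(ss[1:3]) / sum(ss)

# Residual standard error: square root of MSE
rse <- sqrt(ms[["error"]])
```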

JMP: Fitting the Model

Analyze > Fit Model. Assign the response (sales) to Y, highlight both treatment factors (height and width), then click Macros > Full Factorial.

JMP: Fitting the Model

Response > Expanded Estimates

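Recall the “sum to zero” constraints: \(\sum_i\alpha_i=0\), \(\sum_j\beta_j=0\), and \(\sum_i\alpha\beta_{ij}=\sum_j\alpha\beta_{ij}=0\). The blank entries below follow from the printed estimates.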

\(\hat\mu =\)

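Height (Factor A)

| Level | Estimate |
|---|---|
| Bottom | \(\hat\alpha_1=-7\) |
| Middle | \(\hat\alpha_2=16\) |
| Top | \(\hat\alpha_3=\) |

Width (Factor B)

| Level | Estimate |
|---|---|
| Regular | \(\hat\beta_1=-1\) |
| Wide | \(\hat\beta_2=\) |

Height x Width interaction

| | Regular | Wide |
|---|---|---|
| Bottom | \(\widehat{\alpha\beta}_{11}=2\) | \(\widehat{\alpha\beta}_{12}=\) |
| Middle | \(\widehat{\alpha\beta}_{21}=-1\) | \(\widehat{\alpha\beta}_{22}=\) |
| Top | \(\widehat{\alpha\beta}_{31}=\) | \(\widehat{\alpha\beta}_{32}=\) |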
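
The blanks can be filled in by applying the sum-to-zero constraints to the coefficients printed by `summary(bakery_mod)`. The sketch below does this by hand instead of querying the model object. It assumes `height1`, `height2`, and `width1` correspond to bottom, middle, and regular (the factor level order set earlier), and the object names are arbitrary.

```r
# Estimates printed by summary(bakery_mod) under sum-to-zero contrasts
mu    <- 51
alpha <- c(bottom = -7, middle = 16)   # height1, height2
beta  <- c(regular = -1)               # width1
ab    <- c(bottom = 2, middle = -1)    # height1:width1, height2:width1

# Last level of each factor = minus the sum of the other levels
alpha <- c(alpha, top = -sum(alpha))
beta  <- c(beta, wide = -sum(beta))

# Interaction effects: every row and every column sums to zero
ab_regular <- c(ab, top = -sum(ab))
ab_mat     <- cbind(regular = ab_regular, wide = -ab_regular)

# Estimated cell means: mu + alpha_i + beta_j + (alpha beta)_ij
cell_means <- mu + outer(alpha, beta, "+") + ab_mat
cell_means

# Cross-check against the ANOVA table (r = 2 stores per cell, a = 3, b = 2)
r <- 2
c(SSA  = r * 2 * sum(alpha^2),
  SSB  = r * 3 * sum(beta^2),
  SSAB = r * sum(ab_mat^2))
```

The last line uses the fact that, in a balanced design, each treatment sum of squares is the replication count times the sum of squared effect estimates, so it should agree with the `Sum Sq` column of the `anova()` output.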

Decomposition of Model Effects

Recall our statistical effects model: \(y_{ijk} = \mu +\alpha_i +\beta_j + \alpha\beta_{ij} + \epsilon_{ijk}\)
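
As a worked instance, take the first store shown in the `head()` output (bottom shelf, regular width, sales of 47) and treat it as \(k=1\), which is an assumption about store ordering. Plugging in the estimates from `summary(bakery_mod)`, the fitted value is \(51 + (-7) + (-1) + 2 = 45\), and the residual is what remains:

\[\underbrace{47}_{y_{111}} = \underbrace{51}_{\hat\mu} + \underbrace{(-7)}_{\hat\alpha_1} + \underbrace{(-1)}_{\hat\beta_1} + \underbrace{2}_{\widehat{\alpha\beta}_{11}} + \underbrace{2}_{\hat\epsilon_{111}}\]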

Model Diagnostics

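par(mfrow = c(2,2))  # 2 x 2 grid so the standard lm diagnostic plots share one page
plot(bakery_mod)     # residual plots for checking the normality and constant-variance assumptions

par(mfrow = c(1,1))  # restore the single-plot layout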